[WIP] Add a pytorch lightning trainer #3860
Conversation
Also, if you're excited about this possibility, please put an emoji reaction on the PR description, so we know how much to prioritize this work.
Now that you mention this, there's FeatHub that allows people to propose and vote for features. See this for an example.
@matt-gardner how does this interact with the integration of some callbacks into the original trainer? Should I wait on that, given this PR? We should have either this or the current trainer with some additional callbacks; we should not retain two trainers. I would be in favour of this pytorch lightning approach, as I think it will make it easier for people to use allennlp as a library, and encourage them to do so.
I think we should have both for now, until we have some evidence on what people prefer and what actually works better. This is not a …
And before you say, "two ways of doing things is confusing", I don't think so in this case, because we're not actually implementing a trainer here. We're implementing a shim that connects allennlp model and data code with the pytorch lightning trainer, just as our …
Another argument: we should definitely not make our library entirely dependent on another trainer. Because then if that other trainer decides to change things in a way that we're not happy with, we're in a pretty bad position, and our library is totally unusable. We should have a built-in trainer. What this work is trying to do is demonstrate that our data and model code are also compatible with other trainers. I'm not really sure why this is a bad thing.
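To make the "shim" idea concrete, here is a minimal sketch of such a connector, assuming an AllenNLP `Model` whose forward pass returns an output dict containing a `"loss"` key (AllenNLP's convention when labels are supplied). The class name and constructor are illustrative assumptions, not the code in this PR.

```python
import pytorch_lightning as pl
import torch
from allennlp.models import Model


class AllenNlpShim(pl.LightningModule):  # hypothetical name
    def __init__(self, model: Model, optimizer: torch.optim.Optimizer):
        super().__init__()
        self.model = model
        self.optimizer = optimizer

    def training_step(self, batch, batch_idx):
        # AllenNLP models take the batch as keyword arguments and return
        # an output dict; Lightning only needs the loss tensor back.
        return self.model(**batch)["loss"]

    def configure_optimizers(self):
        return self.optimizer
```

The point of the argument above is that this wrapper is the whole maintenance surface: allennlp keeps its own built-in trainer, and the shim stays small.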
What about doing it in another repo that has both dependencies?
Cool that you guys are adding this! Happy to help if needed.
@matt-gardner I plan to take a shot at the details in the next few days. Should I submit it as a separate PR, or make a pull request into this pull request?
@mikerossgithub Matt is traveling at the moment, so I'd suggest a separate PR - we can merge them or close this one later. Thanks for contributing!
I checked in a version to the pytorch-lightning library. I'm not really sure which project the code belongs in, as it needs to remain compatible with both allennlp and pytorch-lightning.
Thanks @mikerossgithub, that's great! For the next week, we're pretty focused on EMNLP submissions and our 1.0 release, but I should be able to look at what you have and what else is left in a week or two. I'm fine with this code living in either place; @williamFalcon, any opinion on which repo should have it?
Within the next week, I'm going to make a GitHub template repository that gives an easy starting place for someone who wants to use pytorch lightning with allennlp data and model code. It'll be similar to this, but with the fixes I mentioned above. A template project seems like the right way to provide this kind of integration. Given this planned work, I'm closing this PR.
@mikerossgithub (@matt-gardner) Depends what you are looking for:
In terms of using lightning as a backbone, we're heavily used at many labs and companies. As such, we work closely with them to make sure they have what they need. If there's something specific you might want, just ping me on our Slack. The downside of having a parallel trainer is that we are moving very fast and adding new features/testing/support which you'll likely miss out on. For instance, we now support TPU training and are currently working on doing TPU testing on CI as well. Not to mention all the integrations and other partners that we have, in addition to the almost 200 contributors + 8 core members + staff members on the lightning team working around the clock to add new features and make the library more robust :)
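For what it's worth, opting in to that TPU support was a single `Trainer` flag in Lightning releases from around the time of this thread (later versions replaced it with `accelerator`/`devices` arguments):

```python
import pytorch_lightning as pl

# `tpu_cores` was the flag in Lightning releases of this era; newer
# releases use accelerator="tpu" with a `devices` count instead.
trainer = pl.Trainer(tpu_cores=8, max_epochs=10)
# trainer.fit(my_lightning_module)  # any LightningModule works here
```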
Hi @matt-gardner, any updates on this project?
Matt and I spoke a few weeks back. We’re allocating some resources to build the templates for the team. cc @nateraw
I'm still noodling around, but in about 20 minutes I was able to get a simple test passing. This needs a lot of work to actually be usable, but the proof of concept is here; it doesn't look like it's very much work, or very much code that we'd have to maintain. I only used `Params` here because that's what the test I was copying did. Still to do:
- Take the `Params` bit out of the code that's added here, and just make a `TrainWithLightning` object that takes concrete objects as arguments (that we can construct `FromParams` if anyone wants to use that pipeline). This probably also requires a `Registrable` `AllenNlpLightningModule` that can be subclassed for different configurations.

Bottom line: this is definitely possible, and should be very easy to support. I don't know exactly what that support should look like, as I have no experience with this other trainer. If you have specific requests, comment below. I also will be out of town for a week and likely won't be working on this for a little bit, so if anyone is super eager and wants to try fleshing this out, let me know.
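A hedged sketch of how the to-do items above might fit together. Only the names `TrainWithLightning`, `AllenNlpLightningModule`, `FromParams`, and `Registrable` come from the description; the constructor arguments and the `run` method are assumptions, not the eventual design.

```python
import pytorch_lightning as pl
import torch
from allennlp.common import FromParams, Registrable
from allennlp.models import Model


class AllenNlpLightningModule(pl.LightningModule, Registrable):
    """Registrable, so subclasses for different configurations can be
    selected by name from config files."""

    def __init__(self, model: Model, optimizer: torch.optim.Optimizer):
        super().__init__()
        self.model = model
        self.optimizer = optimizer

    def training_step(self, batch, batch_idx):
        # AllenNLP models put their loss in the output dict.
        return self.model(**batch)["loss"]

    def configure_optimizers(self):
        return self.optimizer


class TrainWithLightning(FromParams):
    """Takes concrete objects, so it can be built directly or constructed
    FromParams if anyone wants to use that pipeline."""

    def __init__(
        self,
        lightning_module: AllenNlpLightningModule,
        train_dataloader: torch.utils.data.DataLoader,
        trainer_kwargs: dict = None,
    ):
        self.lightning_module = lightning_module
        self.train_dataloader = train_dataloader
        self.trainer_kwargs = trainer_kwargs or {}

    def run(self):
        trainer = pl.Trainer(**self.trainer_kwargs)
        trainer.fit(self.lightning_module, self.train_dataloader)
```

Constructing the module and dataloader as concrete objects first, then handing them to `TrainWithLightning`, keeps the `Params` machinery out of the integration code itself, which is what the first to-do item asks for.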